FIR filter based upon squaring

ABSTRACT

An improved FIR filter based upon squaring is used to self-determine a filter constant equal to the sum-of-squares of the filter coefficients. An input signal is forced to zero for T samples, where T is the number of accumulator cells in an accumulator stage, and at the end of such zero samples the output from the filter is latched as the filter constant for use in filtering the normal input signal. The FIR filter may also be placed in a co-processor mode, using a FIFO register between the input of the FIR filter and a processor bus. A CPU on the bus initiates the co-processor mode and loads data into the FIFO. When the FIFO has data the data is read out and input to the FIR filter. The output of the FIR filter is placed on the processor bus. To determine the values of the filter coefficients loaded in the FIR filter, the data loaded by the CPU is an impulse signal having T−1 zero samples before and after an impulse sample, the output for each sample representing one of the filter coefficients.

BACKGROUND OF THE INVENTION

The present invention relates to digital filtering of electronic signals, and more particularly to an improved finite impulse response (FIR) filter based upon squaring that uses the filter itself to determine a constant equal to the sum of the squares of the filter coefficients.

U.S. Pat. No. 5,561,616, issued to the present inventor on Oct. 1, 1996 and entitled “FIR Filter Based Upon Squaring”, illustrates a FIR filter based on squaring that reduces the complexity of the filter as opposed to standard FIR filter convolutions. At the output of the FIR filter is a subtracting circuit which has as one of its inputs a constant equal to the sum of the squares of the filter coefficients. The sum of the squares of the coefficients is indicated as being determined in a non-real-time manner with minimal hardware, by pre-calculation, or externally as an entry with each new set of coefficients as they are loaded.

Also, once the set of coefficients is loaded into the filter, when the filter is coupled to a processor bus where multiple devices may access the filter, it is sometimes desirable to be able to determine what the coefficients of the filter are. A shadow set of values may be kept in a central processing unit (CPU) coupled to the bus from which the FIR filter was loaded, but this does not allow testing of the FIR filter by reading back the filter coefficients that were written. Alternatively, the coefficient inputs of the filter itself may be tied into a separate filter bus for reading back the filter coefficients, but this adds to the complexity of the FIR filter.

What is desired is an improved FIR filter based on squaring where the modification provides a way to easily determine the sum of the squares of the coefficients, while being accessible to external devices on a processor bus.

BRIEF SUMMARY OF THE INVENTION

Accordingly the present invention provides an improved FIR filter based on squaring that has the capability of computing the sum of the squares of the filter coefficients for use by the filter as well as providing the ability for an external device coupled to the bus to determine the coefficients currently loaded into the FIR filter. The input signal is forced to zero for T samples, where T is the number of accumulator stages, or taps, of the filter. At the end of T zero samples the output is latched to store the constant for use in the final subtraction stage of the filter. The filter may also be used as a co-processor by providing a FIFO interface with a CPU bus. The CPU, when it initiates the co-processor mode, disables the external input signal and itself clocks and loads data values into the FIFO. When the FIFO has data, it is read out into the input of the filter as a data sample and a local filter clock is generated. The output of the filter is returned to the CPU via the bus. To determine what the loaded filter coefficients are, the data loaded into the FIFO by the external device is an impulse signal represented by T−1 zero samples both before and after an impulse sample. The outputs from the filter then represent each of the filter coefficients.

The objects, advantages and other novel features are apparent from the following detailed description when read in conjunction with the appended claims and attached drawing figures.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING

The FIGURE is a block diagram view of an improved FIR filter according to the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Referring now to the FIGURE, a multiplexer 30 is placed prior to an input squaring circuit 12 of a FIR filter 10 based on squaring. The multiplexer 30 has as inputs an external filter input signal, a zero signal and the output of a first-in, first-out (FIFO) register 32. The input of the FIFO 32 is coupled to a processor bus 34 to which also is coupled a central processing unit (CPU) 36. A synchronous clock gate circuit 38 has as an input a system clock C_(s) and a clock gate from a second multiplexer 31 which selects between an external clock enable signal and a FIFO_has_data signal depending on a mode control input, and provides as an output a filter clock C_(k). A mode control line is coupled between the bus 34 and the multiplexers 30, 31. Other devices 40 that access the FIR filter 10 may also be coupled to the bus 34.

The signal y(t) input to the FIR filter 10 via the multiplexer 30 is input to a squaring circuit 12 and the accumulator stages 14, the output of which is input to an input signal accumulator 18. The output from the input signal accumulator 18 is input to the accumulator stages 14 together with the input signal y(t) and the filter coefficients C(0, . . . , T−1) which may be generated by the CPU 36 and loaded via the bus 34. The outputs from the input signal accumulator 18 and the accumulator stages 14 are input to a first subtractor 24, the output of which is input to a second subtractor 26 and a latch 42. The output of the latch 42 is the second input to the second subtractor 26, the output of which provides the output signal z(t) that also is coupled to the bus 34. A timer 44 provides a load command to the latch 42, the timer being reset by a coefficient write command from the CPU 36 to the accumulation circuit 14.

A two-dimensional filter is conceptually a single one-dimensional filter with a large number of zero coefficients between non-zero regions corresponding to a serially scanned video frame structure and is usually constructed by inserting delays between groups of taps in the accumulation stages. In the FIR filter 10 of the present invention the accumulation path cannot be delayed between groups of paths, as this is effectively adding zero at each tap. If this is done, then:

(y(t)+C(T−n))²=0

or y(t)=−C(T−n), obviously not true. To allow delays to be inserted, each group of taps must be regarded as a separate filter so that the square FIR filter equations

z(t)=(S(T,t)−S(0,t)−K)/2

S(0,t)=S(0,t−1)+y(t)²

are evaluated at the start and end of each group. The subtraction of K, where K=Sum(C(i)², (i,0,T−1)), is not required at an intermediate stage of the accumulation stages 14; the overall sum-of-squares of all the separate filter coefficients may be subtracted at one point at the end.
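The equations above can be checked numerically. The following Python sketch is illustrative only: the per-cell recursion S(n,t) = S(n−1,t−1) + (y(t)+C(T−n))² is an assumption inferred from the squared term shown earlier and is not spelled out in this text, and the function names `squaring_fir` and `direct_fir` are hypothetical. It uses integer samples and starts from all-zero registers, so the first T−1 outputs are start-up transients and are not compared.

```python
def squaring_fir(samples, coeffs):
    """Filter `samples` using only squaring, addition and subtraction."""
    T = len(coeffs)
    K = sum(c * c for c in coeffs)          # sum-of-squares constant
    s = [0] * (T + 1)                       # s[n] models register S(n, t-1)
    out = []
    for y in samples:
        # S(0,t) = S(0,t-1) + y(t)^2 (input signal accumulator), then the
        # assumed cell recursion S(n,t) = S(n-1,t-1) + (y(t) + C(T-n))^2.
        s = [s[0] + y * y] + [s[n - 1] + (y + coeffs[T - n]) ** 2
                              for n in range(1, T + 1)]
        # z(t) = (S(T,t) - S(0,t) - K) / 2; exact once the chain is filled.
        out.append((s[T] - s[0] - K) // 2)
    return out


def direct_fir(samples, coeffs):
    """Reference: z(t) = Sum(y(t-i) * C(i)) with zero history before t = 0."""
    return [sum(c * samples[t - i] for i, c in enumerate(coeffs) if t - i >= 0)
            for t in range(len(samples))]
```

From output index T−1 onward the two functions agree, which is the identity (y+C)² − y² − C² = 2·y·C summed over the T taps.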

K may be most simply evaluated by observing that the sum-of-squares is independent of the input signal, and so K may be regarded as an unknown constant offset. z(t)=Sum(y(t−i)*C(i), (i,0,T−1)) generates a zero output at time t if y(t−i)=0 for all i from 0 to T−1. By not initially subtracting out K, forcing the input signal to zero implies that S(T,t)−S(0,t) is the value of K at time t. This may be achieved in practice by circuitry that forces the y sample inputs to zero for T samples, where T is the number of stages in the accumulation stages 14, and then latching the value. If a sample state machine, which may be implemented by the CPU 36, represents discrete sample counts in a sequence with sample_count=0 at the start, then the sequence may be described formally as:

//sample_count state machine//

if coefficient_has_been_changed then
  sample_count←0 //start or restart sequence//
else
  if sample_count≦T then //increment to T+1 and stop//
    sample_count←sample_count+1
  end if
end if

What this outlines is that, when the coefficients for the FIR filter 10 are changed, coefficient_has_been_changed is set to one and the sample_count is reset to zero. For each subsequent sample, coefficient_has_been_changed is at zero and the sample_count increments by 1 until the count is greater than T.

//action based upon sample_count value//

if sample_count<T then
  y(t)=0 //force 0 into filter//
else
  y(t)=filter_system_input //normal filter operation//
  if sample_count==T then
    K←S(T,t)−S(0,t) //latch the value of K//
  end if
end if

The conditional “←” assignments are usually performed with a D-type flip-flop with input enable. In practice additional pipelining that delays the emergence of the results S(T,t)−S(0,t) may require that the latch signal be delayed correspondingly. The timer 44 is designed to grab K at state T and stop at state T+1, which no longer forces the input y to zero and which subsequently generates the correct convolution output z(t). Any write of a new coefficient restarts the timer sequence again. This avoids having to read the stored coefficients to calculate K by a separate means, which also simplifies the circuit in each accumulation cell in the accumulation stages 14. Therefore as long as the sample_count is less than or equal to T, the control line to the multiplexer 30 selects the zero input for feeding into the FIR filter 10. When the sample_count is equal to T, then the output from the first subtractor 24 is captured in the latch 42 by the load signal from the timer 44 so that at T+1 the input y(t) to the FIR filter 10 is the external filter input signal and the second input to the second subtractor 26 is K from the latch.
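The self-determination of K can be illustrated with a small software model. This Python sketch is a simplification under stated assumptions, not a description of the actual circuit: the class `SquaringFirCore` and the function `calibrate_and_filter` are hypothetical names, the cell recursion S(n,t) = S(n−1,t−1) + (y(t)+C(T−n))² is assumed, there is no extra pipelining, and the latch is modeled as reading the registers at state T before that state's sample is clocked in, following the pseudocode.

```python
class SquaringFirCore:
    """Accumulator chain up to the first subtractor (K not yet removed)."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.s = [0] * (len(coeffs) + 1)    # s[n] models register S(n, t)

    def raw(self):
        """First-subtractor output S(T,t) - S(0,t) held in the registers."""
        return self.s[-1] - self.s[0]

    def clock(self, y):
        """Advance one filter clock with input sample y."""
        c, T, s = self.coeffs, len(self.coeffs), self.s
        # Assumed cell recursion: S(n,t) = S(n-1,t-1) + (y(t) + C(T-n))^2
        self.s = [s[0] + y * y] + [s[n - 1] + (y + c[T - n]) ** 2
                                   for n in range(1, T + 1)]


def calibrate_and_filter(core, external_samples):
    """Run the sample_count sequence once, right after a coefficient write."""
    T = len(core.coeffs)
    sample_count = 0                        # reset by the coefficient write
    K = None
    outputs = []
    for x in external_samples:
        if sample_count < T:
            y = 0                           # force 0 into filter
        else:
            y = x                           # normal filter operation
            if sample_count == T:
                K = core.raw()              # latch the value of K
        core.clock(y)
        if K is not None:
            outputs.append((core.raw() - K) // 2)
        if sample_count <= T:               # increment to T+1 and stop
            sample_count += 1
    return K, outputs
```

In this model the latched K does not depend on stale data left in the registers by earlier samples or earlier coefficients, because T zero samples flush every cell. The external samples that arrive during the T forced-zero clocks are simply discarded.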

The FIR filter 10 may work as a co-processor with the CPU by using the FIFO 32 between the bus 34 and the filter 10 input. In this mode the multiplexer 30 is placed in the co-processor mode by the mode control signal, and the FIR filter input is driven from the FIFO output while the local clock C_(k) is only enabled by the synchronous clock gate 38 when the FIFO has data. In this way each write by the CPU 36 at a particular address forces a single data sample value into the FIFO 32 and allows the filter 10 to clock it in, which in turn empties the FIFO by one, so only a single filter clock is generated by each write.

The synchronous clock gate 38 may be implemented by a D-type flip-flop clocked by the system clock C_(s), with the output of the multiplexer 31 as input and the local filter clock C_(k) as output, the local filter clock being fed back through a delay to asynchronously clear the flip-flop and determine the local filter clock pulse width. In normal mode the external clock enable is applied to the flip-flop and the local filter clock is the system clock. In the co-processor mode the FIFO_has_data is applied to the flip-flop and the local filter clock is generated synchronous with the system clock only when the FIFO 32 has been loaded by the CPU 36 with data.

The output of the filter z(t) is provided on the bus 34 so that the CPU 36 may read the filter output before writing the next data sample to the FIFO 32. This allows rapid linear convolution to be performed within a software program by using the squaring FIR filter 10 as a co-processor. It also allows the use of a system clock (FIFO read clock) at a different frequency from the CPU clock (FIFO write clock) as long as the FIFO can transfer data between its asynchronously related read and write clocks. Thus the FIR filter 10 may be used for realtime convolution applications independently of the CPU clock when not in the co-processor mode. More formally:

//FIFO write clock domain//

FIFO_input = CPU_data_bus
if FIFO_write_from_CPU then
  write_into_FIFO() //affects FIFO_has_data//
end if

//FIFO read clock (filter clock) domain//

if coprocessor_mode then
  if FIFO_has_data then
    read_from_FIFO() //affects FIFO_has_data//
  end if
  filter_system_input = FIFO_output
  filter_clock_enable = FIFO_has_data
else
  filter_system_input = external_sample_input //y(t)//
  filter_clock_enable = external_clock_enable //C_(k)//
end if
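The gating described by the pseudocode above may be sketched as a behavioral model in Python. Everything here is an illustrative assumption: the class name `CoprocessorFilter` and its methods are hypothetical, both clock domains are collapsed into ordinary function calls, and a direct-form FIR stands in for the squaring filter so that the example stays short. The point shown is only that one CPU write produces exactly one filter clock.

```python
from collections import deque


class CoprocessorFilter:
    """Behavioral model of the FIFO-gated filter clock (not cycle accurate)."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.history = [0] * len(coeffs)    # y(t), y(t-1), ..., y(t-T+1)
        self.fifo = deque()
        self.coprocessor_mode = False
        self.output = 0                     # z(t) as seen on the bus
        self.filter_clocks = 0

    def cpu_write(self, value):
        """FIFO write clock domain: the CPU pushes one data sample."""
        self.fifo.append(value)

    def system_clock(self, external_sample=0, external_clock_enable=False):
        """FIFO read clock domain: one tick of the system clock."""
        if self.coprocessor_mode:
            enable = bool(self.fifo)        # FIFO_has_data
            y = self.fifo.popleft() if enable else None
        else:
            enable = external_clock_enable
            y = external_sample
        if enable:                          # a filter clock is generated
            self.history = [y] + self.history[:-1]
            self.output = sum(c * h for c, h in zip(self.coeffs, self.history))
            self.filter_clocks += 1
        return self.output
```

A CPU loop would call `cpu_write(x)`, let at least one system clock elapse, and then read `output` before writing the next sample. System clocks that find the FIFO empty leave the filter state untouched.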

As a practical application this may be used to read back coefficients once programmed to help diagnose hardware problems. If no such mechanism is provided, then indirect read back of coefficients may, as for any linear convolution system, be provided by an external CPU by using the co-processor feature to inject an isolated impulse at some time t, isolated in the sense that each impulse is surrounded by at least T−1 samples of zero value on either side, and then observing the outputs which are:

Output(t+i)=C(i)*y_impulse(t), 0≦i<T

The T output values may be divided by the impulse size to obtain each of the T coefficient values C(i), providing the coefficient read back capability.
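The read-back procedure can be sketched in Python as follows. The names `make_fir` and `read_back_coefficients` are hypothetical, and the filter is treated as a black box (here a direct-form FIR stand-in with zero initial history) so that only the impulse technique itself is illustrated.

```python
def make_fir(coeffs):
    """Stand-in black-box filter: z(t) = Sum(y(t-i) * C(i))."""
    def run(samples):
        return [sum(c * samples[t - i] for i, c in enumerate(coeffs) if t - i >= 0)
                for t in range(len(samples))]
    return run


def read_back_coefficients(filter_fn, T, impulse=1):
    """Recover C(0..T-1) from the response to one isolated impulse."""
    # T-1 zero samples before and after the impulse sample.
    stimulus = [0] * (T - 1) + [impulse] + [0] * (T - 1)
    response = filter_fn(stimulus)
    t = T - 1                               # time index of the impulse
    # Output(t+i) = C(i) * y_impulse(t), so divide by the impulse size.
    return [response[t + i] / impulse for i in range(T)]
```

A larger impulse size can help when the filter output is quantized, since each coefficient is then scaled up before the division; that choice is a design consideration of this sketch rather than something stated in the text.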

Thus the present invention provides an improved FIR filter based on squaring that self-determines a constant equal to the sum-of-squares of the filter coefficients by forcing the input signal to zero for T samples, where T equals the number of accumulator cells in the accumulation stage, and latching the result after T samples as the constant, which filter may also be used as a co-processor to determine what filter coefficients are loaded into the filter by inserting a FIFO between a CPU bus and the input to the filter for loading data from the CPU into the filter, the data being in the form of an impulse signal to determine the filter coefficients.

What is claimed is:
 1. An improved FIR filter based on squaring of the type having an input stage for squaring an input signal, an input accumulator for accumulating the squared input signal, an accumulator stage having T cells for combining the input signal, the squared input signal and the filter coefficients, a first subtractor for subtracting the squared input signal from the output of the accumulator stage and a second subtractor for subtracting a constant equal to the sum-of-squares of the filter coefficients from the output of the first subtractor to produce a filtered output signal, wherein the improvement comprises: means for forcing the input signal to zero for T samples, where T is the number of accumulator cells in the accumulator stage; and means for latching the output from the first subtractor as the constant for input to the second subtractor after T zero samples of the input signal have been input to the filter.
 2. The improved FIR filter as recited in claim 1 further comprising: a first-in, first-out (FIFO) register coupled between the input of the filter and a processor bus; a processor coupled to the processor bus, the processor having the ability to put the filter into a co-processor mode, at which time the processor loads data into the FIFO which is subsequently read out as an input to the filter, the output of the filter also being coupled to the processor bus.